3 research outputs found

    Finding role communities in directed networks using Role-Based Similarity, Markov Stability and the Relaxed Minimum Spanning Tree

    We present a framework to cluster nodes in directed networks according to their roles by combining Role-Based Similarity (RBS) and Markov Stability, two techniques based on flows. First we compute the RBS matrix, which contains the pairwise similarities between nodes according to the scaled number of in- and out-directed paths of different lengths. The weighted RBS similarity matrix is then transformed into an undirected similarity network using the Relaxed Minimum-Spanning Tree (RMST) algorithm, which uses the geometric structure of the RBS matrix to unblur the network, such that edges between nodes with high, direct RBS are preserved. Finally, we partition the RMST similarity network into role-communities of nodes at all scales using Markov Stability to find a robust set of roles in the network. We showcase our framework through a biological and a man-made network. Comment: 4 pages, 2 figures
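The first step described above, building a similarity matrix from scaled counts of in- and out-directed paths, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `rbs_similarity`, the parameters `alpha` and `max_len`, the scaling by the spectral radius, and the use of cosine similarity between path-count profiles are all assumptions made for illustration. The RMST and Markov Stability steps are not covered.

```python
import numpy as np

def rbs_similarity(A, alpha=0.95, max_len=10):
    """Hypothetical sketch of a role-based similarity matrix.

    A[i, j] = 1 is assumed to mean a directed edge i -> j.
    Each node gets a profile of scaled counts of outgoing and incoming
    paths of lengths 1..max_len; similarity is the cosine between profiles.
    """
    A = np.asarray(A, dtype=float)
    n = A.shape[0]

    # Assumed scaling: damp paths of length k by (alpha / spectral radius)^k.
    lam = np.max(np.abs(np.linalg.eigvals(A)))
    beta = alpha / lam if lam > 1e-12 else alpha

    out_cols, in_cols = [], []
    v_out = np.ones(n)
    v_in = np.ones(n)
    for _ in range(max_len):
        v_out = beta * (A @ v_out)    # scaled number of paths leaving each node
        v_in = beta * (A.T @ v_in)    # scaled number of paths arriving at each node
        out_cols.append(v_out)
        in_cols.append(v_in)

    X = np.column_stack(out_cols + in_cols)

    # Cosine similarity between rows; isolated nodes keep a zero profile.
    norms = np.linalg.norm(X, axis=1)
    norms[norms == 0] = 1.0
    Xn = X / norms[:, None]
    return Xn @ Xn.T

# Usage: a directed star, node 0 pointing at nodes 1, 2 and 3.
A_star = np.array([
    [0, 1, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
])
S_star = rbs_similarity(A_star)
```

In this toy star the three leaves have identical flow profiles (one incoming path, no outgoing paths) and so are fully similar to one another, while the hub, which only sends, shares no profile entries with them.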

    Interest communities and flow roles in directed networks: the Twitter network of the UK riots

    Directionality is a crucial ingredient in many complex networks in which information, energy or influence are transmitted. In such directed networks, analysing flows (and not only the strength of connections) is crucial to reveal important features of the network that might go undetected if the orientation of connections is ignored. We showcase here a flow-based approach for community detection in networks through the study of the network of the most influential Twitter users during the 2011 riots in England. Firstly, we use directed Markov Stability to extract descriptions of the network at different levels of coarseness in terms of interest communities, i.e., groups of nodes within which flows of information are contained and reinforced. Such interest communities reveal user groupings according to location, profession, employer, and topic. The study of flows also allows us to generate an interest distance, which affords a personalised view of the attention in the network as viewed from the vantage point of any given user. Secondly, we analyse the profiles of incoming and outgoing long-range flows with a combined approach of role-based similarity and the novel relaxed minimum spanning tree algorithm to reveal that the users in the network can be classified into five roles. These flow roles go beyond the standard leader/follower dichotomy and differ from classifications based on regular/structural equivalence. We then show that the interest communities fall into distinct informational organigrams characterised by a different mix of user roles reflecting the quality of dialogue within them. Our generic framework can be used to provide insight into how flows are generated, distributed, preserved and consumed in directed networks. Comment: 32 pages, 14 figures. Supplementary Spreadsheet available from: http://www2.imperial.ac.uk/~mbegueri/Docs/riotsCommunities.zip or http://rsif.royalsocietypublishing.org/content/11/101/20140940/suppl/DC

    Unravelling biological processes using graph theoretical algorithms and probabilistic models

    No full text
    This thesis develops computational methods that can provide insights into the behaviour of biomolecular processes. The methods extract a simplified representation/model from samples characterising the profiles of different biomolecular functional units. The simplified representation helps us gain a better understanding of the relations between the functional units or between the samples. The proposed computational methods integrate graph theoretical algorithms and probabilistic models. Firstly, we were interested in finding proteins that have a similar role in the transcription cycle. We performed a clustering analysis on an experimental dataset using a graph partitioning algorithm. We found groups of proteins associated with different stages of the transcription cycle. Furthermore, we estimated a network model describing the relations between the clusters and identified proteins that are representative of a cluster or of the relation between two clusters. Secondly, we proposed a computational framework that unravels the structure of a biological process from high-dimensional samples characterising different stages of the process. The framework integrates a feature selection procedure and a feature extraction algorithm in order to extract a low-dimensional projection of the high-dimensional samples. We analysed two microarray datasets characterising different cell types that are part of the blood system and found that the extracted representations capture the structure of the hematopoietic stem cell differentiation process. Furthermore, we showed that the low-dimensional projections can be used as a basis for analysis of gene expression patterns. Finally, we introduced the geometric hidden Markov model (GHMM), a probabilistic model for multivariate time series data. The GHMM assumes that the time series lie on a noisy low-dimensional manifold and infers a dynamical model that reflects the low-dimensional geometry. We analysed multivariate time series data generated with a stochastic model of a biomolecular circuit and showed that the estimated GHMM captures the oscillatory behaviour of the circuit. Open Access